
More strategies for NumPy #472

Merged (12 commits) Mar 8, 2017

Conversation

Zac-HD
Member

@Zac-HD Zac-HD commented Feb 23, 2017

I've been using Hypothesis to test numpy-based code for a while, and have accumulated a few utility functions to generate diverse dtypes and arrays. This pull request is much nicer than those utilities, since I've taken the time to prod at some unusual dtypes and write a more general API. I hope that this will encourage more software testing in the scientific Python community!

@DRMacIver
Member

Thanks for doing this!

I haven't looked into it in extreme detail yet (but I will). Immediate feedback on the desired points:

whether I'm using @st.defines_strategy correctly

Yes, looks correct to me.

how efficiently and effectively these strategies will produce values and minimise

So the numpy package already has problems with producing values and minimising when the dimensions are large. These are mostly caused by areas where Hypothesis is not currently very good (basically generating large amounts of data). I've been thinking about ways to mitigate that - basically moving to something where the generated values are sparse differences from some default value - but I haven't experimented in detail. You'd be welcome to experiment with that yourself, or to consider it out of scope for this pull, as you prefer.

One area where you might want to investigate explicitly what minimizing looks like is when the dimensions come from a strategy. That sort of length parameter + drawing a fixed amount of data shrinks a lot better than it used to in Hypothesis, but it's still not as good as something that is more directly adapted to Hypothesis's approach. It should be at worst ok though.

the general API. In particular, I have different ways of giving a parameter range for bit-size of ints and floats, vs period of timestamps. Consistency may be better, but which if any should change?

I need to think about this a bit more. My feeling is that given the relatively small number of options, the explicit sizes parameter might be the better choice of the two.

what other tests would be useful for the new functionality?

I need to think about this a bit more too. It might be useful to have some shrink quality tests that assert different things shrink to particular small values.

(and why do they fail on 2.7?)

This is probably a compatibility issue. I'd recommend running the tests with pytest --pdb and seeing what type is being passed to struct that it doesn't understand. My guess would be that it's an hbytes where it should be a str or something, but I'm not sure why that would be (getting byte behaviour consistent between Python 2 and 3 is awful).
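As a minimal illustration of the kind of bytes/str split being described (this is not the failing Hypothesis code, just the general behaviour of the stdlib struct module on Python 3):

```python
import struct

# On Python 3, struct.pack returns bytes, and packed data passed back to
# struct.unpack must also be bytes - passing the wrong string type is
# exactly the sort of 2/3 incompatibility that raises errors like the above.
packed = struct.pack('<i', 5)
assert isinstance(packed, bytes)
assert struct.unpack('<i', packed) == (5,)
```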

@Zac-HD
Member Author

Zac-HD commented Feb 24, 2017

Thanks!

I probably won't worry about efficiency so long as it's still usable, but writing some tests to confirm good shrinking behaviour sounds good.

@Zac-HD Zac-HD force-pushed the numpy-strategies branch 2 times, most recently from fd4ae01 to a79a04e Compare February 25, 2017 06:37
@Zac-HD
Member Author

Zac-HD commented Feb 27, 2017

Turns out it was a typo in the naming of NumPy sub-dtypes after all. I've added some more tests, including minimisation.

And along with my typo, an unrelated flaky test. And here's another one.

@Zac-HD Zac-HD force-pushed the numpy-strategies branch 5 times, most recently from 3fbd4c5 to ae76154 Compare March 1, 2017 05:55
@Zac-HD Zac-HD changed the title More strategies for Numpy (WIP) More strategies for NumPy Mar 1, 2017
@Zac-HD Zac-HD force-pushed the numpy-strategies branch 4 times, most recently from 8f9e4b4 to 6e380f6 Compare March 1, 2017 21:29
@Zac-HD
Member Author

Zac-HD commented Mar 2, 2017

I've gone back and squashed in changes to the tests which were otherwise flaky due to hitting the buffer overflow detector (yay, it works), and linked above to the Travis runs that hit other flaky tests. This has not been a great time for AWS to be down or Travis logs to be randomly irretrievable. I don't know why CircleCI thinks it's still running; if you click through, the tests have passed!

@DRMacIver - I'm done here. All the tests are passing, I've finally realised that isort is sorting by length rather than lexically (why??), and there's even some lovely documentation. Closes #259, while I think about it.

@alexwlchan
Contributor

Thanks for doing this!

I’ve skimmed the patch and it looks good, but other work means I won’t have time for a detailed review for about a week.

Member

@DRMacIver DRMacIver left a comment

This looks pretty good in general. I started to add a bunch of comments in line, but they seem to mostly fall into two general patterns so I stopped commenting on every one.

  • arguments shouldn't be validated with assert, but should instead raise InvalidArgument. You may want to define some helper functions for e.g. checking that a value is one of a set of options.
  • There are a lot of places where you have boolean flags that I think should actually be separate strategies, which people can then combine with one_of or the | operator.

Other than those I think this looks pretty good. I'll need to have another pass through it afterwards, but I think those are the only big things.

Thanks again for doing this! I'm super excited to get the numpy support improved.
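A minimal sketch of the validation helpers suggested above. Note that InvalidArgument here is a local stand-in for hypothesis.errors.InvalidArgument so the snippet runs on its own, and check_argument/check_one_of are hypothetical helper names, not part of the real Hypothesis API:

```python
# Stand-in for hypothesis.errors.InvalidArgument, so this sketch is
# self-contained; in the real code you would import the exception instead.
class InvalidArgument(ValueError):
    pass


def check_argument(condition, message):
    """Raise InvalidArgument instead of using a bare assert."""
    if not condition:
        raise InvalidArgument(message)


def check_one_of(name, value, options):
    """Check that `value` is one of a fixed set of options."""
    check_argument(
        value in options,
        '{}={!r} must be one of {!r}'.format(name, value, options),
    )
```

With such a helper, a check like `assert endianness in ('?', '<', '=', '>')` becomes `check_one_of('endianness', endianness, ('?', '<', '=', '>'))`, which gives callers a catchable, descriptive error rather than an AssertionError.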

@@ -56,12 +72,19 @@ class ArrayStrategy(SearchStrategy):
     def __init__(self, element_strategy, shape, dtype):
         self.shape = tuple(shape)
         assert shape
-        self.array_size = reduce(operator.mul, shape)
+        self.array_size = np.prod(shape)
         if self.array_size * dtype.itemsize > settings().buffer_size:
Member

You can access this as settings.default.buffer_size rather than creating a new settings object.

Member Author

Hmm, what I really want to do is get the size of the current buffer - you're saying that settings.default.buffer_size reflects any user settings too?

-        self.array_size = reduce(operator.mul, shape)
+        self.array_size = np.prod(shape)
         if self.array_size * dtype.itemsize > settings().buffer_size:
             raise ValueError((
Member

InvalidArgument from hypothesis.errors is probably the right thing to use here.

Member Author

I'll switch out all the asserts, but is a value drawn from a strategy conceptually an invalid argument? (I lean towards yes, but worth checking)

'Insufficient bytes of entropy to draw requested array. '
'shape={}, dtype={}. Consider increasing settings().'
'buffer_size to more than {} if slow test runs and very slow '
'minimization are acceptable.').format(shape, dtype,
Member

Possibly add "or reduce the dimensions of the array" as a suggestion?

assert endianness in ('?', '<', '=', '>')
if isinstance(sizes, int):
sizes = (sizes,)
assert sizes and all(s in (8, 16, 32, 64) for s in sizes)
Member

Assertions are best not used for argument validation. These should all raise InvalidArgument if the condition fails.

include these values in the real_sizes argument.

"""
assert real_sizes and all(s in (16, 32, 64, 96, 128) for s in real_sizes)
Member

Same comment about assertions for argument validation.


@st.defines_strategy
def timestamp_dtypes(max_period='Y', min_period='ns',
allow_datetime=True, allow_timedelta=True):
Member

I feel like datetime and timedelta should probably be independent top level strategies rather than boolean flags to a single strategy. It's not obvious to me that there are many circumstances where having a value which could be either one is a reasonable thing to do, and mixing them together is easy enough to do in those cases.

Member Author

Fair enough. I'll split these up, along with the numeric type strategies. I think string types are confused often enough that it's useful to keep them in one strategy though.



@st.defines_strategy
def scalar_dtypes(integers=True, floats=True, times=True, strings=True):
Member

It's not obvious to me that these boolean arguments are an improvement over people writing one_of(integer_dtypes(), ...) themselves. I can see that it's useful to have an 'any scalar dtype' strategy, but do you have a particular use case in mind where it's more desirable to exclude one or two types than to explicitly include the types you want?

Member Author

No particular case, so I'll go back to a no-argument function. Composing your own also allows tweaking the individual types, which does have use cases (e.g. 'anything 64-bit').
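To illustrate the 'anything 64-bit' use case (this is plain NumPy, not Hypothesis code - it just shows the concrete dtypes a composed strategy of per-type 64-bit strategies would produce):

```python
import numpy as np

# Composing per-type strategies lets a user pick exactly the sizes they
# want; these are the scalar dtypes an 'anything 64-bit' composition
# of integer/unsigned/float strategies would draw from.
sixty_four_bit = [np.dtype(t) for t in ('int64', 'uint64', 'float64')]
assert all(dt.itemsize == 8 for dt in sixty_four_bit)
```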

assert isinstance(strat, SearchStrategy)
# And use it to fill an array of that dtype
assume(10 * dtype.itemsize <= settings().buffer_size)
arrays(dtype, 10, strat)
Member

It's not obvious to me what this arrays call is testing. Note that because of the way deferred strategies work it's not actually going to be running any of the array generation code. Did you maybe want to have an additional data() parameter to given and call data.draw on this so as to force a generation?

Member

It would look something like this:

@given(nested_dtypes(), st.data())
def test_infer_strategy_from_dtype(dtype, data):
    # Given a dtype
    assert isinstance(dtype, np.dtype)
    # We can infer a strategy
    strat = from_dtype(dtype)
    assert isinstance(strat, SearchStrategy)
    # And use it to fill an array of that dtype
    assume(10 * dtype.itemsize <= settings().buffer_size)
    data.draw(arrays(dtype, 10, strat))



def test_minimise_scalar_dtypes():
    assert find(scalar_dtypes(), lambda x: True) == np.dtype(u'bool')
Member

There's the 'minimal' function in hypothesis.internal.debug that you probably want here.



@st.defines_strategy
def array_dtypes(subtype_strategy=scalar_dtypes(),
Member

I confess I don't really understand what's going on here, but that may be more unfamiliarity with numpy in general and this corner of it in particular. I'd find it useful if there were some tests explicitly for array_dtypes that helped elaborate on what it was for/how to use it.

Member Author

@Zac-HD Zac-HD Mar 2, 2017

I'll improve the docs here, but basically this is NumPy's equivalent to a namedtuple, or struct if you're coming from the C-side. Eg you might have an array of dtype([('time', M8[ns]), ('lat', f8), ('lon', f8)]) representing a GPS track.

To be honest I don't think many people will use this strategy - generally you have a pretty specific idea of what the structure should be - but it's useful to test from_dtypes and construct the nested case at least!
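The GPS-track example above can be written out concretely (plain NumPy; the field names and values are just for illustration):

```python
import numpy as np

# A compound ("array") dtype is NumPy's analogue of a namedtuple or a C
# struct: named, typed fields packed into each array element.
gps_point = np.dtype([('time', 'M8[ns]'), ('lat', 'f8'), ('lon', 'f8')])
track = np.zeros(3, dtype=gps_point)
track['lat'] = [51.50, 51.51, 51.52]

assert track.dtype.names == ('time', 'lat', 'lon')
assert track['lat'][1] == 51.51
```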

Member

Fair enough.

@Zac-HD
Member Author

Zac-HD commented Mar 2, 2017

@alexwlchan, thanks! I'm in no rush, and it's been a pleasure. Using Hypothesis to write tests for strategies is the most fun kind of testing I've done in a long time.

@DRMacIver, thanks for the feedback. I'll address the details that I want to discuss inline, and otherwise make the relevant changes. I'll also rebase the history into something pretty again.

"""Return a strategy for generating array (compound) dtypes, with members
drawn from the given subtype strategy."""
assert 1 <= min_size <= max_size
native_text = st.text if text_type is str else st.binary
Member

Pedant point I just spotted: This is a native string, not a native text. The whole problem with the 2/3 split is that Python 2 strings aren't really text.

Member Author

That's not pedantry, it's correctness. Especially important in this context...

@Zac-HD Zac-HD force-pushed the numpy-strategies branch from 3b822d6 to c365c73 Compare March 2, 2017 13:12
docs/numpy.rst Outdated


.. automodule:: hypothesis.extra.numpy
:members: arrays, array_shapes, scalar_dtypes, boolean_dtypes, unsigned_integer_dtypes, integer_dtypes, floating_dtypes, complex_number_dtypes, datetime64_dtypes, timedelta64_dtypes, string_dtypes, array_dtypes, nested_dtypes
Member Author

This line is only here because arrays isn't picked up by Sphinx due to the @st.composite decorator. Any idea why?

Member

Hmm. That sounds like a bug in composite. My guess would be that some relevant underlying property is not being copied over. __module__ maybe?

Member Author

Yeah, that was pretty much my assumption. I don't think it's worth my time to dive into the details of decorators in this case where it's so easy to work around - I may have a look later, but not as part of this pull.

Member

Yup. Not as part of this pull request is entirely the correct approach. I'll open an issue about this.

@Zac-HD Zac-HD force-pushed the numpy-strategies branch 3 times, most recently from da81c64 to 834567d Compare March 5, 2017 03:02
Member

@DRMacIver DRMacIver left a comment

Mostly looks good, thanks! Two comments in line.

@st.defines_strategy
def array_shapes(min_dims=1, max_dims=3, min_side=1, max_side=10):
"""Return a strategy for array shapes (tuples of int >= 1)."""
check_argument(1 <= min_dims <= max_dims, u'Invalid dimensions')
Member

FWIW, my preference for error messages is that they should always include the values that were invalid (and ideally an explanation of why they were invalid). It makes debugging things a lot less annoying when using an API wrong.

I feel bad making you adhere to my level of pet peeve on this one though, so I'll let you decide whether you care enough to change this.

Member Author

No worries, I actually agree - when I get an error message I want it to be good enough that I don't need any other information to fix the problem.
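A hypothetical variant of a check_argument helper whose messages embed the offending values, in the spirit of the preference above. InvalidArgument is again a local stand-in for hypothesis.errors.InvalidArgument so the sketch runs on its own:

```python
# Local stand-in so this sketch is self-contained.
class InvalidArgument(ValueError):
    pass


def check_argument(condition, fmt, *args):
    """Raise InvalidArgument with the invalid values baked into the message."""
    if not condition:
        raise InvalidArgument(fmt.format(*args))


min_dims, max_dims = 3, 1  # deliberately invalid, for illustration
try:
    check_argument(1 <= min_dims <= max_dims,
                   'min_dims={} must not exceed max_dims={}',
                   min_dims, max_dims)
except InvalidArgument as err:
    print(err)  # min_dims=3 must not exceed max_dims=1
```

An error like this tells the user what was wrong and by how much, with no debugger required.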



@defines_dtype_strategy
def string_dtypes(byte_strs=True, unicode_strs=True,
Member

I do still think these should be separate. It's more consistent with the rest of the API, and I don't really want to encourage bytes/unicode confusion.

Member Author

Done.
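The distinction being preserved here is visible directly in NumPy's dtype system (plain NumPy, for illustration):

```python
import numpy as np

# NumPy keeps bytestring ('S') and unicode ('U') dtypes distinct, which
# is one reason separate byte/unicode strategies avoid encouraging
# bytes/text confusion.
assert np.dtype('S5').kind == 'S'   # 5-byte bytestring
assert np.dtype('U5').kind == 'U'   # 5-character unicode string
assert np.dtype('S5') != np.dtype('U5')
```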

Zac-HD added 3 commits March 5, 2017 22:20
np.prod is reduce-by-product, and avoids some imports.
Initialising an empty array is idiomatic as we're about to overwrite all
the zeros, and though irrelevant in this case can be faster for large
arrays.
Test with 'minimal' instead of 'find' - it's the internal equivalent,
with nice settings management and timeouts.
This is particularly useful in composite strategies, where generating
pairs of arrays with arbitrary complementary shapes is useful for many
tests.
Zac-HD added 8 commits March 5, 2017 23:14
Including a metastrategy for dtypes with a size and endianness.
These are unusual types, and often overlooked - which makes them perfect
to throw at code that is meant to handle anything.  Also support
strategy inference for these types.
Both bytestrings and unicode strings, to force clear consideration of
which are handled and how.
Compound (or 'array') dtypes are similar to namedtuples.
Now we can generate them too, and infer strategies for such objects.
Subarray dtypes (eg '(2,3)i4') are also supported, but not generated
with default arguments.

Closes HypothesisWorks#470
Because if anyone is using nested array dtypes - think 'nested structs'
- in production, they need all the help we can offer.
This is among the most common uses for the other strategies, so let's
support it out of the box.
@Zac-HD Zac-HD force-pushed the numpy-strategies branch from 834567d to 73471ca Compare March 5, 2017 12:23
Now lives on its own page, like Django
@Zac-HD Zac-HD force-pushed the numpy-strategies branch from 73471ca to b8ab2ad Compare March 5, 2017 12:42
@DRMacIver
Member

The documentation build is currently failing, I think because you forgot to update it after one of the changes. Other than that, LGTM!

@Zac-HD
Member Author

Zac-HD commented Mar 5, 2017

Yep, forgot to change the members line when I split the string strategies - ironic given comments about leaving that in above. The next build is running now 😄, and once it's done you can merge.

Any timeline for a release?

@DRMacIver DRMacIver merged commit 4fceff1 into HypothesisWorks:master Mar 8, 2017
@DRMacIver
Member

Thanks again for doing this!

RE timeline for release: To be honest we're overdue for a release. I'll try to make sure we get one out in the next week or so.
